Integration and Modularity

Antigoni Kaliontzopoulou, CIBIO/InBIO, University of Porto

17 October, 2019

Integration and Modularity: Basic Concepts

Olson and Miller. (1958). Morphological Integration.

Integration

Modularity

Levels of Modularity

Klingenberg. (2008). Ann. Rev. Ecol. Evol. Syst.

Integration and Modularity: Conceptual Considerations

Assessing Patterns for Blocks of Traits

Evaluating Integration

1: Conditional Independence

1: Conditional Independence: Example

2: Integration Among Hypothesized Modules

\(\small\mathbf{S}_{11}\): covariation of variables in \(\small\mathbf{Y}_{1}\)

\(\small\mathbf{S}_{22}\): covariation of variables in \(\small\mathbf{Y}_{2}\)

\(\small\mathbf{S}_{21}=\mathbf{S}_{12}^{T}\): covariation between \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)

\(\small\mathbf{S}_{21}=\mathbf{S}_{12}^{T}\) is the multivariate equivalent of \(\small\sigma_{21}\)

2: Integration Among Modules: The RV Coefficient

\[\small{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}\] Range of \(\small{RV}\): \(\small{0}\rightarrow{1}\)

Note: ‘tr’ signifies trace (sum of diagonal elements)

The RV coefficient is analogous to \(\small{r}^{2}\) but it is not a strict mathematical generalization

\(\small{r}^{2}=\frac{\sigma^{2}_{xy}}{\sigma^{2}_{x}\sigma^{2}_{y}}\) vs. \(\small{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}\)

The numerator of \(\small{r}^{2}\) & \(\small{RV}\) describes the covariation between \(\small\mathbf{Y}_{1}\) & \(\small\mathbf{Y}_{2}\)

The denominator of \(\small{r}^{2}\) & \(\small{RV}\) describes variation within \(\small\mathbf{Y}_{1}\) & \(\small\mathbf{Y}_{2}\)

Thus, \(\small{RV}\) (like \(\small{r}^{2}\)) is a ratio of between-block relative to within-block variation

However, because each \(\small\mathbf{S}\) is a covariance matrix, the sub-components of \(\small{RV}\) are squared variances and covariances, not variances as in \(\small{r}^{2}\) (hence the range of \(\small{RV}\) is \(\small{0}\rightarrow{1}\))

It is not at all obvious what a ‘squared covariance’ represents in a statistical sense, as variances are already summed squared deviations from the mean (see Bookstein. (2016). Evol. Biol.)
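As a rough illustration of the formula above, the RV coefficient can be computed directly from the partitioned covariance matrix. This is a minimal sketch on simulated data, not the pupfish example; the names `Y1`, `Y2`, and `rv_coef` are placeholders introduced here.

```r
# Minimal sketch of the RV coefficient for two blocks of variables.
# Simulated data; Y1, Y2 and rv_coef are illustrative names, not from the source.
set.seed(1)
n  <- 50
Y1 <- matrix(rnorm(n * 3), n, 3)                 # block 1: 3 variables
B  <- matrix(rnorm(3 * 2), 3, 2)                 # arbitrary link between blocks
Y2 <- Y1 %*% B + matrix(rnorm(n * 2), n, 2)      # block 2: 2 variables, covaries with block 1

rv_coef <- function(Y1, Y2) {
  tr  <- function(M) sum(diag(M))                # trace: sum of diagonal elements
  S11 <- cov(Y1)                                 # covariation within block 1
  S22 <- cov(Y2)                                 # covariation within block 2
  S12 <- cov(Y1, Y2)                             # covariation between blocks; S21 = t(S12)
  tr(S12 %*% t(S12)) / sqrt(tr(S11 %*% S11) * tr(S22 %*% S22))
}

RV <- rv_coef(Y1, Y2)
RV          # RV coefficient
sqrt(RV)    # square root, as reported alongside RV in the examples
```

Swapping the two blocks leaves the value unchanged, and a block compared with itself gives exactly 1.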

2: Integration Among Modules: Partial Least Squares

Another way to summarize the covariation between blocks is via Partial Least Squares (PLS)

PLS decomposes the information in \(\small\mathbf{S}_{12}\) by singular value decomposition to find the rotational solution (direction) that describes the greatest covariation between \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)

\[\small\mathbf{S}_{12}=\mathbf{U\Lambda{V}}^T\]

Ordination scores are found by projecting the centered data onto the vectors in \(\small\mathbf{U}\) and \(\small\mathbf{V}\)

\[\small\mathbf{P}_{1}=\mathbf{Y}_{1}\mathbf{U}\]

\[\small\mathbf{P}_{2}=\mathbf{Y}_{2}\mathbf{V}\]

The first columns of \(\small\mathbf{P}_{1}\) and \(\small\mathbf{P}_{2}\) describe the maximal covariation between \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\)

The correlation between \(\small\mathbf{P}_{11}\) and \(\small\mathbf{P}_{21}\) is the PLS-correlation

\[\small{r}_{PLS}={cor}_{P_{11}P_{21}}\]

Significance is assessed via permutation

Bookstein et al. (2003). J. Hum. Evol.
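The decomposition and projection steps above can be sketched in a few lines of base R. This is only an illustration on simulated data; `two_block_pls` and the data objects are hypothetical names introduced here, not functions from any package.

```r
# Minimal sketch of two-block PLS via SVD of the between-block covariance matrix.
# Simulated data; two_block_pls, Y1 and Y2 are illustrative names, not from the source.
set.seed(1)
n  <- 50
Y1 <- matrix(rnorm(n * 3), n, 3)
Y2 <- Y1 %*% matrix(rnorm(3 * 2), 3, 2) + matrix(rnorm(n * 2), n, 2)

two_block_pls <- function(Y1, Y2) {
  Y1c <- scale(Y1, center = TRUE, scale = FALSE)   # center each block
  Y2c <- scale(Y2, center = TRUE, scale = FALSE)
  S12 <- crossprod(Y1c, Y2c) / (nrow(Y1c) - 1)     # between-block covariance matrix
  dec <- svd(S12)                                  # S12 = U Lambda V^T
  P1  <- Y1c %*% dec$u                             # scores for block 1
  P2  <- Y2c %*% dec$v                             # scores for block 2
  list(r_pls = cor(P1[, 1], P2[, 1]),              # correlation of the first pair of scores
       P1 = P1, P2 = P2, singular_values = dec$d)
}

fit <- two_block_pls(Y1, Y2)
fit$r_pls
```

A useful check on the sketch is that the covariance between the first pair of score vectors equals the first singular value of \(\small\mathbf{S}_{12}\).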

Integration using RV: Example 1

Pecos pupfish (shape obtained using geometric morphometrics)

Is there an association between head shape and body shape?

\[\small{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}=0.607\]

\[\small\sqrt{RV}=0.779\]

Integration using PLS: Example 1

\(\tiny{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}=0.607\) and \(\tiny\sqrt{RV}=0.779\)

\(\small{r}_{PLS}={cor}_{P_{11}P_{21}}=0.916\)

Evaluating Multivariate Associations

We now have two potential test measures of multivariate correlation

\[\small{RV}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}_{11}\mathbf{S}_{11})tr(\mathbf{S}_{22}\mathbf{S}_{22})}}\]

\[\small{r}_{PLS}={cor}_{P_{11}P_{21}}\]

Is one approach preferable over the other?

Permutation Tests for Multivariate Association

Test statistics:

\(\small\hat\rho=\sqrt{RV}\) and \(\small\hat\rho={r}_{PLS}\)

H0: \(\small\rho=0\)

H1: \(\small\rho>0\)

RRPP Approach:

1: Represent \(\small\mathbf{Y}_{1}\) and \(\small\mathbf{Y}_{2}\) as deviations from mean (H0)

2: Estimate \(\small\hat\rho=\sqrt{RV}_{obs}\) and \(\small\hat\rho={r}_{PLS_{obs}}\)

3: Permute rows of \(\small\mathbf{Y}_{2}\), obtain \(\small\hat\rho=\sqrt{RV}_{rand}\) and \(\small\hat\rho={r}_{PLS_{rand}}\)

4: Repeat many times to generate sampling distribution
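The four steps above can be sketched as follows. This is a simplified illustration on simulated data: rows of the second block are shuffled and both statistics are recomputed, with centering done inside each statistic. The helper names (`sqrt_rv`, `r_pls`, `perm_test`) and the choice of 999 iterations are assumptions made here.

```r
# Minimal sketch of a permutation test for sqrt(RV) and r-PLS.
# Simulated data; all function and object names are illustrative, not from the source.
set.seed(42)
n  <- 40
Y1 <- matrix(rnorm(n * 3), n, 3)
Y2 <- Y1 %*% matrix(rnorm(3 * 2), 3, 2) + matrix(rnorm(n * 2), n, 2)

sqrt_rv <- function(Y1, Y2) {
  tr  <- function(M) sum(diag(M))
  S11 <- cov(Y1); S22 <- cov(Y2); S12 <- cov(Y1, Y2)
  sqrt(tr(S12 %*% t(S12)) / sqrt(tr(S11 %*% S11) * tr(S22 %*% S22)))
}

r_pls <- function(Y1, Y2) {
  Y1c <- scale(Y1, scale = FALSE)                  # deviations from the mean
  Y2c <- scale(Y2, scale = FALSE)
  dec <- svd(cov(Y1c, Y2c))
  as.numeric(cor(Y1c %*% dec$u[, 1], Y2c %*% dec$v[, 1]))
}

perm_test <- function(Y1, Y2, stat, iter = 999) {
  obs  <- stat(Y1, Y2)                             # step 2: observed statistic
  rand <- replicate(iter,                          # steps 3-4: shuffle rows of Y2, recompute
                    stat(Y1, Y2[sample(nrow(Y2)), , drop = FALSE]))
  dist <- c(obs, rand)                             # observed value counts as one permutation
  list(obs = obs, p.value = mean(dist >= obs), dist = dist)
}

res_rv  <- perm_test(Y1, Y2, sqrt_rv)
res_pls <- perm_test(Y1, Y2, r_pls)
c(sqrt_RV = res_rv$p.value, r_PLS = res_pls$p.value)
```

Because the observed value is included in the distribution, the smallest attainable p-value with 999 permutations is 0.001.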

Permutation Tests for RV and rPLS: Example

For the pupfish dataset, both are significant at p = 0.001

Compare the two permutation distributions with one another (excluding the observed values in this case)

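# RV.rand and pls.rand hold the permutation values of sqrt(RV) and r-PLS;
# [-1] drops the first element (the observed value)
plot(RV.rand[-1], pls.rand[-1], xlim = c(0, 0.65), ylim = c(0, 0.65),
     xlab = "sqrt(RV)", ylab = "r-pls")
abline(a = 0, b = 1, col = "red", lwd = 2)  # 1:1 reference line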

All things considered, rPLS performs better.

Integration using PLS: Example 2

3: Global Integration

3: Global Integration: Conceptual Considerations

Bookstein. (2015). Evol. Biol.

3: Global Integration and Self-Similarity

##      BEval 
## -0.6369197

4: Assessing Modularity

4: Assessing Modularity: \(\small{RV}\) Coefficient

With random multivariate normal (MVN) data, \(\small{RV}\) varies with both the sample size (n) and the number of variables (p)!

Adams. (2016). Methods Ecol. Evol.

4: Assessing Modularity: \(\small{CR}\) Coefficient

\[\small{CR}=\frac{tr(\mathbf{S}_{12}\mathbf{S}_{21})}{\sqrt{tr(\mathbf{S}^*_{11}\mathbf{S}^*_{11})tr(\mathbf{S}^*_{22}\mathbf{S}^*_{22})}}\]
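A minimal sketch of this ratio, computed exactly as the formula is written here. The starred matrices are not defined on the slide; the code assumes they are the within-block covariance matrices with their diagonals set to zero, which is our reading and should be checked against Adams (2016). The name `cr_coef` and the simulated data are introduced here for illustration only.

```r
# Minimal sketch of the CR ratio as written in the formula above.
# ASSUMPTION: S*11 and S*22 are the within-block covariance matrices with
# their diagonal elements set to zero (not stated in the source).
# Simulated data; cr_coef, Y1 and Y2 are illustrative names.
set.seed(1)
n  <- 50
Y1 <- matrix(rnorm(n * 3), n, 3)
Y2 <- Y1 %*% matrix(rnorm(3 * 2), 3, 2) + matrix(rnorm(n * 2), n, 2)

cr_coef <- function(Y1, Y2) {
  tr   <- function(M) sum(diag(M))
  S12  <- cov(Y1, Y2)                    # between-block covariation
  S11s <- cov(Y1); diag(S11s) <- 0       # assumed S*11: variances removed
  S22s <- cov(Y2); diag(S22s) <- 0       # assumed S*22: variances removed
  tr(S12 %*% t(S12)) / sqrt(tr(S11s %*% S11s) * tr(S22s %*% S22s))
}

CR <- cr_coef(Y1, Y2)
CR
```

Under this assumption each block needs at least two variables; otherwise the starred matrix is all zeros and the ratio is undefined.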

\(\small{CR}\) Coefficient: Statistical Properties

\(\small{CR}\) Coefficient: Examples

5: Comparing Patterns Across Datasets

where: \(\small\mathbf{Z}=\frac{r_{PLS_{obs}}-\mu_{r_{PLS_{rand}}}}{\sigma_{r_{PLS_{rand}}}}\)

Adams and Collyer. (2016). Evol.
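The effect size defined above standardizes the observed statistic by the mean and standard deviation of its permutation distribution. A minimal sketch, with a hypothetical helper `effect_size` and simulated data, might look like this:

```r
# Minimal sketch of the Z effect size from a permutation distribution of r-PLS.
# Simulated data; effect_size, r_pls and the data objects are illustrative names.
set.seed(7)
n  <- 40
Y1 <- matrix(rnorm(n * 3), n, 3)
Y2 <- Y1 %*% matrix(rnorm(3 * 2), 3, 2) + matrix(rnorm(n * 2), n, 2)

r_pls <- function(Y1, Y2) {
  Y1c <- scale(Y1, scale = FALSE)
  Y2c <- scale(Y2, scale = FALSE)
  dec <- svd(cov(Y1c, Y2c))
  as.numeric(cor(Y1c %*% dec$u[, 1], Y2c %*% dec$v[, 1]))
}

# Z = (observed - mean of random values) / sd of random values
effect_size <- function(obs, rand) (obs - mean(rand)) / sd(rand)

obs  <- r_pls(Y1, Y2)
rand <- replicate(499, r_pls(Y1, Y2[sample(n), , drop = FALSE]))
Z    <- effect_size(obs, rand)
Z
```

Because Z is expressed in standard deviation units of each dataset's own permutation distribution, values from datasets that differ in sample size or number of variables are placed on a common scale.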

5: Comparing Integration Across Datasets: example

5: Comparing Modularity Across Datasets

Adams and Collyer. (in press). Evol.

5: Comparing Modularity Across Datasets: example

Adams and Collyer. (in press). Evol.

Integration and Modularity: Perspectives